Abstract. In this paper, we propose a dimension reduction model for spatially dependent
variables. Namely, we investigate an extension of the inverse regression method under a
strong mixing condition. This method is based on the estimation of the covariance matrix
of the expectation of the explanatory variable given the dependent variable, called the
inverse regression. We then study, under the strong mixing condition, the weak and strong
consistency of this estimate, using a kernel estimate of the inverse regression, and we
provide its asymptotic behaviour. A spatial predictor based on this dimension reduction
approach is also proposed. It appears as an alternative to the spatial nonparametric
predictor.

Keywords: Kernel estimator; Spatial regression; Random fields; Strong mixing coefficient;
Dimension reduction; Inverse regression.

1. Introduction 

Spatial statistics includes any technique which studies phenomena observed on a spatial
subset $S$ of $\mathbb{R}^N$, $N \geq 2$ (generally, $N = 2$ or $N = 3$). The set $S$ can be
discrete, continuous, or the set of realizations of a point process. Such techniques have
various applications in several domains such as soil science, geology, oceanography,
econometrics, epidemiology, forestry and many others (see for example [27], [H] or [18] for
exposition, methods and applications).

Most often, spatial data are dependent and any spatial model must be able to handle
this aspect. The novelty of this dependency, unlike time dependency, is the lack of an
order relation. In fact, the notions of past, present and future do not exist in space, and
this property gives great flexibility in spatial modelling.

In the case of spatial regression that interests us, there is an abundant literature on
parametric models. We refer for example to the spatial regression models with correlated
errors often used in economics (see e.g. Anselin and Florax [2], Anselin and Bera [1], Song
and Lee [29]) or to the spatial Generalized Linear Model (GLM) studied in Diggle et al.
[14] and Zhang [36]. Recall also the spatial Poisson regression methods which have been
proposed for epidemiological data (see for example Diggle [13] or Diggle et al. [14]).
Unlike the parametric case, spatial regression in the nonparametric setting has been
studied in only a few papers: see for example Biau and Cadre [5], Lu and Chen [25], Hallin
et al. [H], Carbon et al. [9], Tran and Yakowitz [32] and Dabo-Niang and Yao [12].
Their results show that, as in the i.i.d. case, the spatial nonparametric estimator of the
regression function is penalized by the dimension of the regressor. This is the spatial
counterpart of the well-known problem called "the curse of dimensionality". Recall that
dimension reduction methods are classically used to overcome this issue. Observing an
i.i.d. sample $Z_i = (X_i, Y_i)$, the aim is to estimate the regression function
$m(x) = E(Y | X = x)$. In the dimension reduction framework, one assumes that there exist
$\Phi$, an orthonormal matrix $d \times D$, with $D$ as small as possible, and
$g : \mathbb{R}^D \to \mathbb{R}$, an unknown function, such that the function $m(.)$ can be
written as

(1.1) $m(x) = g(\Phi.x)$.

Model (1.1) conveys the idea that "less information on $X$", namely $\Phi.X$, provides as
much information on $m(.)$ as $X$. The function $g$ is the regression function of $Y$ given
the $D$-dimensional vector $\Phi.X$. Estimating the matrix $\Phi$ and then the function $g$
(by nonparametric methods) provides an estimator which converges faster than the initial
nonparametric estimator. The operator $\Phi$ is unique up to an orthogonal transformation.
It is estimated through an estimation of its range $\mathrm{Im}(\Phi^T)$ (where $\Phi^T$ is
the transpose of $\Phi$), called the Effective Dimensional Reduction (EDR) space.
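To make model (1.1) concrete, here is a minimal sketch in plain Python. It assumes, for illustration only, that $\Phi$ is stored as $D$ rows of length $d$ (so that $\Phi.x$ is $D$-dimensional); the matrix `phi`, the link function `g` and the points `x1`, `x2` are hypothetical and do not come from the paper. The point is that two regressors with the same projection $\Phi.x$ share the same regression value.

```python
import math

def project(phi, x):
    """Return Phi.x for a matrix phi given as a list of D rows of length d."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in phi]

def m(x, phi, g):
    """Regression function under model (1.1): m(x) = g(Phi.x)."""
    return g(project(phi, x))

# Hypothetical example with d = 3 and D = 1: a single unit-norm row.
phi = [[1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]]
g = lambda u: math.sin(u[0])  # illustrative link function, not from the paper

x1 = [1.0, 0.0, 5.0]
x2 = [0.0, 1.0, -3.0]  # differs from x1 but has the same projection Phi.x

print(m(x1, phi, g), m(x2, phi, g))  # the two values coincide
```

Only the one-dimensional quantity `project(phi, x)` matters for `m`, which is why estimating $g$ on the projected data avoids the dimension penalty of the full nonparametric estimator.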

Various methods for dimension reduction exist in the literature for i.i.d. observations.
For example, we refer to multiple linear regression, the generalized linear model (GLM)
in [8], additive models (see e.g. Hastie and Tibshirani [21]), and methods based on the
estimation of the gradient of the regression function $m(.)$, developed for example in
[22] or [35].

In this paper, we focus on the inverse regression method, proposed by Li [24]: if $X$ is
such that for every vector $b$ in $\mathbb{R}^d$ there exists a vector $B$ of $\mathbb{R}^D$
such that $E(b^T X | \Phi.X) = B^T(\Phi.X)$ (this latter condition is satisfied as soon as
$X$ is elliptically distributed), then, if $\Sigma$ denotes the variance of $X$, the space
$\mathrm{Im}(\Sigma^{-1}\mathrm{var}(E(X|Y)))$ is included in the EDR space. Moreover, the
two spaces coincide if the matrix $\Sigma^{-1}\mathrm{var}(E(X|Y))$ is of full rank.
Hence, the estimation of the EDR space is essentially based on the estimation of the
covariance matrix of the inverse regression $E(X|Y)$ and of $\Sigma$, which is estimated by
a classical empirical estimator. In his initial version, Li suggested an estimator based on
the regressogram estimate of $E(X|Y)$, but drawbacks of the regressogram led other authors
to suggest alternatives based on the nonparametric estimation of $E(X|Y)$, see for instance
[23] or [37], which make it possible to recover the optimal rate of convergence in
$\sqrt{n}$.
This work is motivated by the fact that, to our knowledge, there is no inverse regression
estimation method for spatially dependent data under a strong mixing condition. Note
however that a dimension reduction method for supervised motion segmentation based
on spatial-frequential analysis, called Dynamic Sliced Inverse Regression (DSIR), has been
proposed by Wu and Lu [34]. We propose here a spatial counterpart of the estimation
method of [37], which uses kernel estimation of $E(X|Y)$. Other methods based on other
spatial estimators of $E(X|Y)$ will be the subject of further investigation.

Like any spatial model, a spatial dimension reduction model must take into account
spatial dependency. In this work, we focus on the estimation of model (1.1) for spatially
dependent data under strong mixing conditions, the spatial kernel regression estimation
of $E(X|Y)$ having been studied in [HI UHl [9].

An important problem in spatial modelling is that of spatial prediction, the aim being
to reconstruct a random field over some domain from a set of observed values. This is the
problem that interests us in the last part of this paper. More precisely, we will use
the properties of the inverse regression method to build a dimension reduction predictor
which corresponds to the nonparametric predictor of [5]. It is an interesting alternative to
parametric prediction methods such as kriging (see e.g. [33], [H]) or spatial
autoregressive models (see for example [11]), since it does not require any underlying
model. It only requires knowledge of the number of neighbors. We will see that
the properties of the inverse regression method provide a way of estimating this number.

This paper is organized as follows. Section 2 provides some notations and assumptions
on the spatial process, as well as some preliminary results on U-statistics. The
estimation method and the consistency results are presented in Section 3. Section 4 uses
this estimate to forecast a spatial process. Section 5 is devoted to the conclusion. Proofs
and technical lemmas are gathered in Section 6.



2. General setting and preliminary Results 

2.1. Notations and assumptions. Throughout the paper, we will use the following
notations.

For all $b \in \mathbb{R}^d$, $b^{(j)}$ will denote the $j$-th component of the vector $b$;

a point in bold $\mathbf{i} = (i_1, ..., i_N) \in (\mathbb{N}^*)^N$ will be referred to as a
site; we will set $\mathbf{1}_N = (1, ..., 1)$ ($N$ times); if $\mathbf{n} = (n_1, ..., n_N)$,
we will set $\hat{\mathbf{n}} = n_1 \times ... \times n_N$ and write
$\mathbf{n} \to +\infty$ if $\min_{i=1,...,N} n_i \to +\infty$ and $n_i / n_k < C$ for some
constant $C > 0$.

The symbol $\|.\|$ will denote any norm over $\mathbb{R}^d$, $\|u\|_\infty = \sup_x |u(x)|$
for some function $u$, and $1_A(.)$ the indicator function of a set $A$: $1_A(x) = 1$ if
$x \in A$, and $0$ otherwise.

The notation $W_n = O_p(V_n)$ (respectively $W_n = O_{a.s.}(V_n)$) means that
$W_n = V_n S_n$ for a sequence $S_n$ which is bounded in probability (respectively almost
surely).

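We are interested in some $\mathbb{R}^d \times \mathbb{R}$-valued stationary and measurable
random field $Z_{\mathbf{i}} = (X_{\mathbf{i}}, Y_{\mathbf{i}})$,
$\mathbf{i} \in (\mathbb{N}^*)^N$ ($N, d \geq 1$), defined on a probability space
$(\Omega, \mathcal{A}, P)$. Without loss of generality, we consider estimations based on
observations of the process $(Z_{\mathbf{i}}, \mathbf{i} \in \mathbb{Z}^N)$ on some
rectangular set
$I_n = \{\mathbf{i} = (i_1, ..., i_N) \in \mathbb{Z}^N, 1 \leq i_k \leq n_k, k = 1, ..., N\}$
for all $\mathbf{n} \in (\mathbb{N}^*)^N$.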
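As a small illustration of this setting, the rectangular set of sites and the sample size (the product $n_1 \times ... \times n_N$) can be enumerated as follows. This is only a sketch; the function names `rectangular_sites` and `n_hat` are ours, not the paper's.

```python
from itertools import product
from math import prod

def rectangular_sites(n):
    """All sites i = (i_1, ..., i_N) with 1 <= i_k <= n_k, for n = (n_1, ..., n_N)."""
    return list(product(*(range(1, nk + 1) for nk in n)))

def n_hat(n):
    """Number of sites in the rectangle: n_1 x ... x n_N."""
    return prod(n)

sites = rectangular_sites((3, 4))  # a 3 x 4 lattice in dimension N = 2
print(len(sites), n_hat((3, 4)))
```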

Assume that the $Z_{\mathbf{i}}$'s have the same distribution as $(X, Y)$, which is such that:

• the variable $Y$ has a density $f$.

• for all $j = 1, ..., d$, each component $X^{(j)}$ of $X$ is such that the pair
$(X^{(j)}, Y)$ admits an unknown density $f_{X^{(j)},Y}$ with respect to the Lebesgue
measure $\lambda$ over $\mathbb{R}^2$, and each $X^{(j)}$ is integrable.

2.2. Spatial dependency. 



As mentioned above, our model, like any spatial model, must take into account spatial
dependence between values at different locations. Of course, we could consider that there
is a global linear relationship between locations, as is generally done in spatial linear
modeling, but we prefer to use a nonlinear spatial dependency measure. Actually, in many
circumstances the spatial dependency is not necessarily linear (see [3]). This is, for
example, the classical case where one deals with the spatial pattern of extreme events,
such as in the economic analysis of poverty or in environmental science. It is then more
appropriate to use a nonlinear spatial dependency measure such as positive dependency (see
[3]) or the concept of strong mixing coefficients (see Tran [31]). In our case, we will
measure the spatial dependency of the process concerned by means of $\alpha$-mixing and a
local dependency measure.

2.2.1. Mixing condition : 

The field $(Z_{\mathbf{i}})$ is said to satisfy a mixing condition if:

• there exists a function $\chi : \mathbb{R}^+ \to \mathbb{R}^+$ with $\chi(t) \downarrow 0$
as $t \to \infty$, such that whenever $S, S' \subset (\mathbb{N}^*)^N$,

(2.1) $\alpha(\mathcal{B}(S), \mathcal{B}(S')) = \sup_{B \in \mathcal{B}(S),\, C \in \mathcal{B}(S')} |P(B \cap C) - P(B)P(C)| \leq \psi(\mathrm{Card}\,S, \mathrm{Card}\,S')\, \chi(\mathrm{dist}(S, S'))$

where $\mathcal{B}(S)$ (resp. $\mathcal{B}(S')$) denotes the Borel $\sigma$-field generated
by $(Z_{\mathbf{i}}, \mathbf{i} \in S)$ (resp. $(Z_{\mathbf{i}}, \mathbf{i} \in S')$),
$\mathrm{Card}\,S$ (resp. $\mathrm{Card}\,S'$) the cardinality of $S$ (resp. $S'$),
$\mathrm{dist}(S, S')$ the Euclidean distance between $S$ and $S'$, and
$\psi : \mathbb{N}^2 \to \mathbb{R}^+$ is a symmetric positive function nondecreasing in
each variable. If $\psi \equiv 1$, then $Z_{\mathbf{i}}$ is called strong mixing.
It is this latter case which will be tackled in this paper, and for all $v \geq 0$ we have

$\alpha(v) = \sup_{\mathbf{i}, \mathbf{j} \in \mathbb{R}^N,\, \|\mathbf{i} - \mathbf{j}\| = v} \alpha(\sigma(Z_{\mathbf{i}}), \sigma(Z_{\mathbf{j}})) \leq \chi(v)$.

• The process is said to be Geometrically Strong Mixing (GSM) if there exists a
non-negative constant $\rho \in [0, 1[$ such that for all $u > 0$, $\alpha(u) \leq C\rho^u$.

Remark. Many published results have shown that the mixing condition (2.1) is satisfied
by many time series and spatial random processes (see e.g. Tran [31], Guyon [18],
Rosenblatt [28], Doukhan [15]). Moreover, the results presented in this paper could be
extended, under additional technical assumptions, to the case, often considered in the
literature, where $\psi$ satisfies:

$\psi(i, j) \leq c \min(i, j), \quad \forall i, j \in \mathbb{N}$,

for some constant $c > 0$.

In the following, we will consider the case where $\alpha(u) \leq C u^{-\theta}$, for some
$\theta > 0$. However, the results can easily be extended to the GSM case.
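The two decay regimes for the mixing coefficient can be compared numerically. The sketch below uses illustrative constants (`C = 1`, `theta = 3`, `rho = 0.5`) chosen by us, not values from the paper; it only shows that a geometric bound eventually falls below a polynomial one, which is the intuition behind treating the GSM case as the easier one.

```python
def polynomial_rate(u, C=1.0, theta=3.0):
    """Polynomial bound on the mixing coefficient: C * u^(-theta)."""
    return C * u ** (-theta)

def geometric_rate(u, C=1.0, rho=0.5):
    """Geometric (GSM) bound on the mixing coefficient: C * rho^u."""
    return C * rho ** u

for u in (2, 5, 10, 20):
    print(u, polynomial_rate(u), geometric_rate(u))
```

For small distances the geometric bound may be the larger of the two (here at `u = 2`), but it is smaller at large distances.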



2.2.2. Local dependency measure. 

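In order to obtain the same rate of convergence as in the i.i.d. case, one requires
another dependency measure, called a local dependency measure. Assume that

• for $\ell = 1, ..., d$, there exists a constant $\Delta > 0$ such that the pairs
$(X_{\mathbf{i}}^{(\ell)}, X_{\mathbf{j}}^{(\ell)})$ and
$((X_{\mathbf{i}}^{(\ell)}, Y_{\mathbf{i}}), (X_{\mathbf{j}}^{(\ell)}, Y_{\mathbf{j}}))$ admit
densities $f_{\mathbf{i},\mathbf{j}}$ and $g_{\mathbf{i},\mathbf{j}}$ as soon as
$\mathrm{dist}(\mathbf{i}, \mathbf{j}) > \Delta$, such that

$|f_{\mathbf{i},\mathbf{j}}(x, y) - f(x) f(y)| \leq C, \quad \forall x, y \in \mathbb{R}$
$|g_{\mathbf{i},\mathbf{j}}(u, v) - g(u) g(v)| \leq C, \quad \forall u, v \in \mathbb{R}^2$
for some constant $C \geq 0$.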

Remark. The link between the two dependency measures can be found in Bosq [7].

Note that while the second measure (as its name indicates) is used to control the local
dependence, the first one is a kind of "asymptotic dependency" control.
Let $(X_n, n \geq 1)$ be a sequence of real-valued random variables with the same
distribution $F$. Consider the functional:

$\Theta(F) = \int_{\mathbb{R}^m} h(x_1, x_2, ..., x_m)\, dF(x_1) ... dF(x_m)$,

where $m \in \mathbb{N}$, $h(.)$ is some measurable function, called the kernel, and $F$ is a
distribution function from some given set of distribution functions. Without loss of
generality, we can assume that $h(.)$ is invariant under permutation of its arguments.
Otherwise, the transformation
$\frac{1}{m!} \sum_{1 \leq i_1 \neq i_2 \neq ... \neq i_m \leq m} h(x_{i_1}, ..., x_{i_m})$
will provide a symmetric kernel.

A $U$-statistic with kernel $h(.)$ of degree $m$ based on the sample
$(X_i, 1 \leq i \leq n)$ is a statistic defined by:

$U_n = \frac{(n - m)!}{n!} \sum_{1 \leq i_1 \neq i_2 \neq ... \neq i_m \leq n} h(X_{i_1}, ..., X_{i_m})$.

It is said to be an $m$-order $U$-statistic. Let
$h_1(x_1) = \int_{\mathbb{R}^{m-1}} h(x_1, x_2, ..., x_m) \prod_{j=2}^m dF(x_j)$.
The next lemma is a consequence of Lemma 2.6 of Sun & Chian [30].
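The definition of a $U$-statistic, and the symmetrization of a kernel, can be sketched directly in Python. The brute-force enumeration below is meant for small samples only; the function names and the example data are ours. With the kernel $h(x, y) = (x - y)^2 / 2$ of degree 2, the $U$-statistic is the unbiased sample variance, which gives a simple sanity check.

```python
from itertools import permutations
from math import factorial

def u_statistic(sample, h, m):
    """U-statistic of degree m: average of h over all ordered m-tuples of distinct indices."""
    n = len(sample)
    total = sum(h(*(sample[i] for i in idx)) for idx in permutations(range(n), m))
    return total * factorial(n - m) / factorial(n)

def symmetrize(h, m):
    """Return the symmetric kernel obtained by averaging h over the m! argument orders."""
    def h_sym(*xs):
        return sum(h(*p) for p in permutations(xs)) / factorial(m)
    return h_sym

data = [1.0, 2.0, 3.0, 4.0, 6.0]
variance_kernel = lambda x, y: (x - y) ** 2 / 2
print(u_statistic(data, variance_kernel, 2))  # unbiased sample variance of data
```

Symmetrizing the non-symmetric kernel `lambda x, y: x * x - x * y` with `symmetrize(..., 2)` yields the same variance kernel, so both lead to the same statistic.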



Lemma 2.1. Let (X„, n > 1) be a stationary sequence of strongly mixing random 
variables. If there exists a positive number 6 and 6' {0 < 6' < 6) verifying 7 = (4^s){2+5') ^ 
1 such that 

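(2.2) $\|h(X_1, ..., X_m)\|_{4+\delta} < \infty$,

(2.3) $\int_{\mathbb{R}^m} |h(x_1, ..., x_m)|^{4+\delta} \prod_{j=1}^m dF(x_j) < \infty$,

and $\alpha(n) = O(n^{-3(4+\delta')/(2+\delta')})$. Then,

$U_n = \Theta(F) + \frac{2}{n} \sum_{i=1}^n (h_1(X_i) - \Theta(F)) + O_p\left(\frac{1}{n}\right)$.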

To give strong consistency results, we need the following law of the iterated logarithm
for U-statistics:

Lemma 2.2. (Sun & Chian, [30]) Under the same conditions as in the previous lemma, we
have

$U_n - \Theta(F) = \frac{2}{n} \sum_{i=1}^n (h_1(X_i) - \Theta(F)) + O_{a.s.}\left(\sqrt{\frac{\log\log n}{n}}\right)$.

Remark 2.3.
• In the following, we are dealing with a kernel $h(.)$ which depends on $n$. Actually,
it is a classical approach to use $U$-statistics results to get asymptotic results for
kernel estimators; in the i.i.d. case, we refer for example to Härdle and Stoker [20].
In fact, the dependence of $h(.)$ on $n$ does not influence the asymptotic results
presented here.



3. Estimation of the covariance of Inverse Regression Estimator 

We suppose that one deals with a random field $(Z_{\mathbf{i}}, \mathbf{i} \in \mathbb{Z}^N)$
which corresponds, in the spatial regression case, to observations of the form
$Z_{\mathbf{i}} = (X_{\mathbf{i}}, Y_{\mathbf{i}})$, $\mathbf{i} \in \mathbb{Z}^N$
($N \geq 1$), at different locations of a subset of $\mathbb{R}^N$, with some dependency
structure. Here, we are particularly interested in the case where the locations lie on
lattices of $\mathbb{R}^N$. The general continuous case will be the subject of a
forthcoming work.

We deal with the estimation of the matrix $\Sigma_e = \mathrm{var}\,E(X|Y)$ based on the
observations of the process $(Z_{\mathbf{i}}, \mathbf{i} \in I_n)$,
$\mathbf{n} \in (\mathbb{N}^*)^N$. In order to ensure the existence of the matrices
$\Sigma = \mathrm{var}\,X$ and $\Sigma_e = \mathrm{var}\,E(X|Y)$, we assume that
$E\|X\|^4 < \infty$. For the sake of simplicity, we will consider a centered process, so
that $EX = 0$.

To estimate model (1.1), as previously mentioned, one needs to estimate the matrix
$\Sigma^{-1}\Sigma_e$. On the one hand, we can estimate the variance matrix $\Sigma$ by the
empirical spatial estimator, whose consistency is easily obtained. On the other hand, the
estimation of the matrix $\Sigma_e$ is delicate since it requires the study of the
consistency of a suitable estimator of the (inverse) regression function of $X$ given $Y$:

$r(y) = \varphi(y)/f(y)$ if $f(y) \neq 0$, and $r(y) = EX$ if $f(y) = 0$, where
$\varphi(y) = \left( \int x^{(i)} f_{X^{(i)},Y}(x^{(i)}, y)\, dx^{(i)},\ 1 \leq i \leq d \right)$, $y \in \mathbb{R}$.

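An estimator of the inverse regression function $r(.)$, based on
$(Z_{\mathbf{i}}, \mathbf{i} \in I_n)$, is given by

$r_n(y) = \varphi_n(y)/f_n(y)$ if $f_n(y) \neq 0$, and
$r_n(y) = \frac{1}{\hat{\mathbf{n}}} \sum_{\mathbf{i} \in I_n} X_{\mathbf{i}}$ if $f_n(y) = 0$,

with, for all $y \in \mathbb{R}$,

$f_n(y) = \frac{1}{\hat{\mathbf{n}} h_n} \sum_{\mathbf{i} \in I_n} K\left(\frac{y - Y_{\mathbf{i}}}{h_n}\right)$,
$\varphi_n(y) = \frac{1}{\hat{\mathbf{n}} h_n} \sum_{\mathbf{i} \in I_n} X_{\mathbf{i}} K\left(\frac{y - Y_{\mathbf{i}}}{h_n}\right)$,

where $f_n$ is a kernel estimator of the density, $K : \mathbb{R} \to \mathbb{R}$ is a
bounded integrable kernel such that $\int K(x)\, dx = 1$ and the bandwidth $h_n \geq 0$ is
such that $\lim_{n \to +\infty} h_n = 0$.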
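A kernel estimate of the inverse regression of this type can be sketched in a few lines of Python. The sketch below makes several assumptions of ours: the observations over the lattice are flattened into plain lists `xs` (vectors) and `ys` (responses), the kernel is the Epanechnikov kernel, and the estimate is the usual ratio of a kernel-weighted sum of the regressors to a kernel density estimate, with the empirical mean of the regressors as fallback when the density estimate vanishes.

```python
def epanechnikov(u):
    """Compactly supported kernel integrating to one (an illustrative choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def density_estimate(y, ys, h, kernel=epanechnikov):
    """Kernel estimate f_n(y) of the density of Y."""
    return sum(kernel((y - yi) / h) for yi in ys) / (len(ys) * h)

def phi_estimate(y, xs, ys, h, kernel=epanechnikov):
    """Kernel-weighted sum of the X_i, normalized like f_n (one value per coordinate)."""
    d, n = len(xs[0]), len(ys)
    out = [0.0] * d
    for xi, yi in zip(xs, ys):
        w = kernel((y - yi) / h)
        for j in range(d):
            out[j] += w * xi[j]
    return [v / (n * h) for v in out]

def inverse_regression_estimate(y, xs, ys, h, kernel=epanechnikov):
    """Estimate of E(X | Y = y); falls back to the mean of the X_i when f_n(y) = 0."""
    f_n = density_estimate(y, ys, h, kernel)
    if f_n == 0.0:
        n = len(xs)
        return [sum(x[j] for x in xs) / n for j in range(len(xs[0]))]
    return [p / f_n for p in phi_estimate(y, xs, ys, h, kernel)]

xs = [[1.0], [3.0]]
ys = [0.0, 0.5]
print(inverse_regression_estimate(0.0, xs, ys, h=1.0))   # weighted toward the first point
print(inverse_regression_estimate(10.0, xs, ys, h=1.0))  # no data nearby: empirical mean
```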
The consistency of the estimators $f_n$ and $\varphi_n$ has been studied by Carbon et al.
[10]. To guard against observations $y$ with a small density value, we consider the
following density estimator:

$f_{e,n}(y) = \max(e_n, f_n(y))$

where $(e_n)$ is a real-valued sequence such that $\lim_{n \to \infty} e_n = 0$. Then, we
consider the corresponding estimator of $r$:

$r_{e,n}(y) = \frac{\varphi_n(y)}{f_{e,n}(y)}$.



Finally, for X = 4 consider the estimator of Eg: 

n ' ' 



We aim at proving the consistency of the empirical variance associated to this estimator. 

Remark. Here, we use $f_{e,n} = \max(e_n, f_n)$ as the estimator of the density $f$, to
avoid small values. There are other alternatives such as $f_{e,n} = f_n + e_n$ or
$f_{e,n} = \max\{(f_n - e_n), 0\}$.
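The truncation step and the empirical covariance built on top of it can be sketched as follows. Two points are assumptions of ours and should be read as such: only the `max` and additive truncation rules are implemented, and the covariance is taken as the centred empirical covariance of the vectors $r_{e,n}(Y_{\mathbf{i}})$, which is one natural reading of "the empirical variance associated to this estimator".

```python
def truncated_density(f_n, e_n, rule="max"):
    """Keep the density estimate away from zero using the threshold e_n."""
    if rule == "max":
        return max(e_n, f_n)
    if rule == "shift":
        return f_n + e_n
    raise ValueError("unknown truncation rule: " + rule)

def truncated_ratio(phi_n, f_n, e_n, rule="max"):
    """r_{e,n}(y) = phi_n(y) / f_{e,n}(y), coordinate by coordinate."""
    f_e = truncated_density(f_n, e_n, rule)
    return [p / f_e for p in phi_n]

def empirical_covariance(vectors):
    """Centred empirical covariance matrix of a list of d-vectors (assumed estimator form)."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    return [[sum((v[a] - mean[a]) * (v[b] - mean[b]) for v in vectors) / n
             for b in range(d)] for a in range(d)]

# Hypothetical fitted values r_{e,n}(Y_i) at two sites, in dimension d = 2.
print(empirical_covariance([[1.0, 0.0], [3.0, 0.0]]))
```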

3.1. Weak consistency. In the following, for a fixed $\eta > 0$ and a random variable $Z$
in $\mathbb{R}^d$, we will use the notation $\|Z\|_\eta = E(\|Z\|^\eta)^{1/\eta}$.

In this section, we will make the following technical assumptions



(3.1) 

and 
(3.2) 



r{Y) 



r{Y) 



f{Y) 



{/(n<en} 



< oo, for some 6i > 



4+5i 



o 



n 2 



for some 1 > 5 > 0. 



These assumptions are the spatial counterparts of respectively ||?"(^)||4,5 < oo and 



O 



needed in the i.i.d case. 



We also assume some regularity conditions on the functions $K(.)$, $f(.)$ and $r(.)$:

• The kernel function $K(.) : \mathbb{R} \to \mathbb{R}^+$ is a $k$-order kernel with compact
support satisfying a Lipschitz condition $|K(x) - K(y)| \leq C|x - y|$;

• $f(.)$ and $r(.)$ are functions of $C^k(\mathbb{R})$ ($k \geq 2$) such that
$\sup_y |f^{(k)}(y)| < C_1$ and $\sup_y \|\varphi^{(k)}(y)\| < C_2$ for some constants $C_1$
and $C_2$.



Set $\Psi_n = h_n^k + \sqrt{\frac{\log \hat{\mathbf{n}}}{\hat{\mathbf{n}} h_n}}$.



Theorem 3.1. Assume that a{t) < Cr^ t > 0, 9 > 2N and C > 0. IfE{\\X\\) < 00 and 
= E(||X|p|F = .) is continuous. Then for a choice of such that nh^{logn)^^ 
and nh^{\ogn)~^ oo with 9i = fr^, then, we get 

Corollary 3.2. Under Assumptions of Theorem \3.1\ with h ~ n~'^^ , Cn — n~'^^ for some 
positive constants ci and C2 such that ^ + ^<ci<| — 2c2, we have 

Ee,n Op 

Corollary 3.3. (Central limit theorem) Under the previous assumptions, we have

$\sqrt{\hat{\mathbf{n}}}\, (\Sigma_{e,n} - \Sigma_e) \to \Lambda$ in distribution,

where $\Lambda$ is a zero-mean Gaussian on the space of $d$-order matrices with covariance
$\mathrm{var}(r(Y) r(Y)^T)$.




3.2. Strong consistency. 

Here we study the case where the response, Y takes values in some compact set. We 



replace the assumption 



r(Y) , 

7{y)l{/{^)<en} 



^[-d^J byE(exp(||r(y)|| l{/(y)<e„})) 



O (n ^) for some ^ > 0. : Eexp7||X|| < oo for some constant 7 > 0. 

Theorem 3.4. // (Zu) is GSM, for a choice of h^ such that nh^(\ogn)~^ and 
n/in(logn)~^^^^ —>■ 00. Assume also that inis f{y) > for some compact set S, then 
under the Assumptions of Lemma \2.1\ we have: 

f k K 



el 



Corollary 3.5. Under previous Assumptions, with — (n) , — ii '^^ for some 



positive constants Ci and C2 such that % + ^<Ci<i — 2c2, we get 




log log n 



n 



As mentioned previously, the eigenvectors associated with the positive eigenvalues of
$\Sigma_n^{-1}\Sigma_{e,n}$ provide an estimation of the EDR space. Classically, weak and
strong consistency results concerning the estimation of the EDR space are obtained by using
the previous consistency results for the estimators of $\Sigma$ and $\Sigma_e$,
respectively, together with perturbation theory, as for example in [37].
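The eigenvector step can be illustrated with a small pure-Python sketch. It assumes that the two matrices have already been estimated and that the first one is invertible, and it only extracts the leading direction (the case $D = 1$) by power iteration; a full implementation would rely on a linear algebra library and return all directions with positive eigenvalues. The function names and the example matrices are ours.

```python
def solve(a, b):
    """Solve a.x = b for a square matrix a and a matrix b (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def leading_direction(sigma, sigma_e, iters=200):
    """Unit eigenvector of sigma^{-1} sigma_e for its largest eigenvalue (power iteration)."""
    d = len(sigma)
    a = solve(sigma, sigma_e)  # the matrix sigma^{-1} sigma_e
    v = [1.0] * d              # starting vector; assumed not orthogonal to the target
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical estimates in dimension d = 2 with a rank-one inverse regression covariance.
sigma = [[2.0, 0.0], [0.0, 1.0]]
sigma_e = [[1.0, 1.0], [1.0, 1.0]]
print(leading_direction(sigma, sigma_e))  # proportional to (1, 2)
```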
4. Spatial inverse method for spatial prediction
4.1. Prediction of a spatial process.

Let $(\xi_{\mathbf{n}}, \mathbf{n} \in (\mathbb{N}^*)^N)$ be an $\mathbb{R}$-valued strictly
stationary random spatial process, assumed to be observed over a subset $O_n \subset I_n$
($I_n$ is a rectangular region as previously defined for some
$\mathbf{n} \in (\mathbb{N}^*)^N$). Our aim is to predict the square integrable value
$\xi_{\mathbf{i}_0}$ at a given site $\mathbf{i}_0 \in I_n - O_n$. In practice, one expects
that $\xi_{\mathbf{i}_0}$ only depends on the values of the process on a bounded vicinity
set (as small as possible) $V_{\mathbf{i}_0} \subset O_n$, i.e. that the process
$(\xi_{\mathbf{i}})$ is (at least locally) a Markov Random Field (MRF) according to some
system of vicinity. Here, we will assume (without loss of generality) that the set of
vicinities $(V_{\mathbf{j}}, \mathbf{j} \in (\mathbb{N}^*)^N)$ is defined by $V_{\mathbf{j}}$
of the form $\mathbf{j} + V$ (called vicinity prediction in Biau and Cadre [5]). Then
it is well known that the minimum mean-square error predictor of $\xi_{\mathbf{i}_0}$ given
the data in $V_{\mathbf{i}_0}$ is

$E(\xi_{\mathbf{i}_0} | \xi_{\mathbf{i}}, \mathbf{i} \in V_{\mathbf{i}_0})$

and we can consider as predictor any $d$-dimensional vector (where $d$ is the cardinality of
$V$) of elements of $V_{\mathbf{i}_0}$ concatenated and ordered according to some order.
Here, we choose the vector of values of $(\xi_{\mathbf{n}})$ which correspond to the $d$
nearest neighbors: for each $\mathbf{i} \in \mathbb{Z}^N$, we consider that the predictor is
the vector $\xi_{\mathbf{i}}^d = (\xi_{\mathbf{i}(k)}; 1 \leq k \leq d)$ where
$\mathbf{i}(k)$ is the $k$-th nearest neighbor of $\mathbf{i}$. Then, our problem of
prediction amounts to estimating:

$m(x) = E(\xi_{\mathbf{i}_0} | \xi_{\mathbf{i}_0}^d = x)$.

For this purpose we construct the associated process: 

$Z_{\mathbf{i}} = (X_{\mathbf{i}}, Y_{\mathbf{i}}) = (\xi_{\mathbf{i}}^d, \xi_{\mathbf{i}}), \quad \mathbf{i} \in \mathbb{Z}^N$

and we consider the estimation of $m(.)$ based on the data
$(Z_{\mathbf{i}}, \mathbf{i} \in O_n)$ and the model (1.1). Note that the linear
approximation of $m(.)$ leads to linear predictors. The available literature on such
spatial linear models (think of the kriging method or the spatial autoregressive method) is
relatively abundant, see for example Guyon [18], Anselin and Florax [2], Cressie [11],
Wackernagel [33]. In fact, the linear predictor is the optimal predictor (in the minimum
mean square error sense) when the random field under study is Gaussian. Hence linear
techniques for spatial prediction give unsatisfactory results when the process is not
Gaussian. In this latter case, other approaches such as log-normal kriging or
trans-Gaussian kriging have been introduced. These methods consist in transforming the
original data into Gaussian distributed data. But such methods lead to outliers, which
appear as an effect of the heavy-tailed densities of the data and cannot be deleted.
Therefore, a specific consideration is needed. This can be done by using, for example, a
nonparametric model. That is what is proposed by Biau and Cadre [5], where a predictor
based on kernel methods is developed. But the latter (the kernel nonparametric predictor),
like all kernel estimators, is subject to the so-called curse of dimensionality and is thus
penalized by $d$ ($= \mathrm{card}(V)$), as highlighted in Section 1. Classically, as in
Section 1, one uses dimension reduction, such as the inverse regression method, to
overcome this problem. We propose here an adaptation of the inverse regression method
to get a dimension reduction predictor based on model (1.1):

(4.1) $\xi_{\mathbf{i}} = g(\Phi.\xi_{\mathbf{i}}^d)$.
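The associated process, which pairs the value at each site with the values at its $d$ nearest neighbors, can be built on a two-dimensional lattice as in the sketch below. The ordering of neighbors (Euclidean distance, ties broken lexicographically), the search radius, and the representation of the field as a dictionary keyed by site are assumptions made for illustration; the paper only requires that some fixed ordering be used.

```python
def nearest_offsets(d, radius=3):
    """First d lattice offsets sorted by Euclidean distance (ties broken lexicographically)."""
    offs = [(a, b) for a in range(-radius, radius + 1)
            for b in range(-radius, radius + 1) if (a, b) != (0, 0)]
    offs.sort(key=lambda o: (o[0] ** 2 + o[1] ** 2, o))
    return offs[:d]

def build_pairs(field, d):
    """Return the pairs (X_i, Y_i) = (neighbor values, site value).

    field maps a site (i1, i2) to its observed value; only sites whose d
    neighbors are all observed contribute a pair.
    """
    offs = nearest_offsets(d)
    pairs = []
    for (i1, i2), value in sorted(field.items()):
        neigh = [(i1 + a, i2 + b) for a, b in offs]
        if all(s in field for s in neigh):
            pairs.append(([field[s] for s in neigh], value))
    return pairs

# Hypothetical 3 x 3 field whose value encodes the site: 10 * i1 + i2.
field = {(i1, i2): 10 * i1 + i2 for i1 in range(1, 4) for i2 in range(1, 4)}
print(build_pairs(field, d=4))  # only the central site has its 4 neighbors observed
```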

Remark 4.1.

(1) To estimate this model, we need to check the SIR condition in the context of
prediction, i.e. that $X$ is such that for every vector $b$ in $\mathbb{R}^d$ there exists a
vector $B$ of $\mathbb{R}^D$ such that $E(b^T X | \Phi.X) = B^T(\Phi.X)$. This condition is
satisfied if the process $(\xi_{\mathbf{i}})$ is a spatial elliptically distributed process,
such as a Gaussian random field.

(2) In the time series forecasting problem, the "inverse regression" property can be a
handicap, since one then needs to estimate the expectation of the "future" given
the "past". So, the process under study must be reversible. The flexibility that
spatial modelling provides overcomes this drawback since, as mentioned in the
introduction, the notions of past, present and future do not exist.

At this stage, one can use the estimation method for model (1.1), given in Section 1, to
get a predictor. Unfortunately (as usual in prediction problems) $d$ is unknown in
practice. So, we propose to estimate $d$ by using the fact that we are dealing both with a
Markov property and inverse regression, as follows.

4.2. Estimation of the number of neighbors necessary for prediction.

Note that we suppose that the underlying process is a stationary Markov process with
respect to the $d$-neighbors system of neighborhood, so the variables
$\xi_{\mathbf{i}(k)}$ and $\xi_{\mathbf{i}}$ are independent as soon as $k > d$ and

$E(\xi_{\mathbf{i}(k)} | \xi_{\mathbf{i}} = y) = 0$

(since $(\xi_{\mathbf{i}})$ is a stationary zero-mean process).

Furthermore, our estimator (of model (1.1)) is based on the estimation of
$E(X | Y = y) = E(\xi_{\mathbf{i}}^d | \xi_{\mathbf{i}} = y) = (E(\xi_{\mathbf{i}(k)} | \xi_{\mathbf{i}} = y);\ 1 \leq k \leq d)$,
which allows us to keep only the neighbors $\xi_{\mathbf{i}(k)}$ for which
$E(\xi_{\mathbf{i}(k)} | \xi_{\mathbf{i}} = y) \neq 0$. An estimate of $d$ is then obtained
by estimating $\arg\min_k \{E(\xi_{\mathbf{i}(k)} | \xi_{\mathbf{i}} = y) = 0\}$. We propose
the following algorithm to get this estimator.
Algorithm for estimation of $d$, the number of neighbors.

(1) Initialization: specify a parameter $\delta > 0$ (small) and fix a site $\mathbf{j}_0$;
set $k = 1$.

(2) Compute $r_n^{(k)}(y)$, the kernel estimate of $r^{(k)}(y) = E(X^{(k)} | Y = y)$.

(3) If $|r_n^{(k)}(y)| > \delta$, then $k = k + 1$ and continue with Step 2; otherwise
terminate and $d = k$.
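The three steps above can be sketched as a short loop. This is a hedged illustration under assumptions of ours: the data are flattened into lists, `neighbour_columns[k - 1]` holds the values of the $k$-th nearest neighbor over the observed sites, the kernel is Epanechnikov, and the density is truncated with the `max` rule. As in the algorithm, the returned value is the first index $k$ at which the estimate does not exceed $\delta$.

```python
def epanechnikov(u):
    """Compactly supported kernel integrating to one (an illustrative choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def r_component(y, x_k, ys, h, e_n):
    """Kernel estimate of E(X^(k) | Y = y), with the density truncated at e_n."""
    n = len(ys)
    w = [epanechnikov((y - yi) / h) for yi in ys]
    f_n = sum(w) / (n * h)
    phi = sum(wi * xi for wi, xi in zip(w, x_k)) / (n * h)
    return phi / max(e_n, f_n)

def estimate_d(y, neighbour_columns, ys, h, e_n, delta):
    """Return the first k whose estimated inverse regression is at most delta in absolute value.

    If every candidate exceeds delta, the returned value is one more than the
    number of candidates, signalling that more neighbors should be examined.
    """
    k = 1
    while (k <= len(neighbour_columns)
           and abs(r_component(y, neighbour_columns[k - 1], ys, h, e_n)) > delta):
        k += 1
    return k

# Hypothetical data: the first neighbor is informative, the second is not.
ys = [0.0, 0.1, -0.1, 0.05]
columns = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
print(estimate_d(0.0, columns, ys, h=1.0, e_n=0.01, delta=0.1))
```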

Then, we can compute a predictor based on d = k: 
4.3. The dimension reduction predictor. 

To get the predictor, we suggest the following algorithm: 
(1) compute 

j<=n 

r*n{y) 



ieOn.VigCOi, 



iecin,Vi(,cci„ 



(2) compute 



(3) Carry out the principal component analysis of $\Sigma_n^{-1}\Sigma_{e,n}$, both to get a
basis of $\mathrm{Im}(\Sigma_n^{-1}\Sigma_{e,n})$ and an estimate of $D$, the dimension of
$\mathrm{Im}(\Phi)$, as suggested in the next remark

(4) compute the predictor: 

based on data G On); where is the kernel estimate: 

E 6^/.„($n(a^-ef)) 
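The last step, a kernel regression of the site value on the projected neighbor vector, might look like the following sketch. It is written under assumptions of ours: the estimated projection `phi` is given as $D$ rows of length $d$, a product Epanechnikov kernel is applied to the $D$ projected coordinates, and the empirical mean of the responses is used as a fallback when no observation receives positive weight.

```python
def epanechnikov(u):
    """Compactly supported kernel integrating to one (an illustrative choice)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def predict(x0, xs, ys, phi, h):
    """Kernel regression estimate of g at phi.x0 from pairs (xs[i], ys[i])."""
    def proj(x):
        return [sum(p * xj for p, xj in zip(row, x)) for row in phi]
    z0 = proj(x0)
    num = den = 0.0
    for x, y in zip(xs, ys):
        w = 1.0
        for a, b in zip(z0, proj(x)):
            w *= epanechnikov((a - b) / h)  # product kernel over the D coordinates
        num += w * y
        den += w
    if den == 0.0:
        return sum(ys) / len(ys)  # fallback chosen for this sketch
    return num / den

# Hypothetical data with d = 2 and a single estimated direction (D = 1).
phi = [[1.0, 0.0]]
xs = [[0.0, 5.0], [0.5, -7.0], [3.0, 0.0]]
ys = [1.0, 3.0, 100.0]
print(predict([0.0, 99.0], xs, ys, phi, h=1.0))  # the distant third point gets zero weight
```

Because the kernel acts on the $D$ projected coordinates rather than on the full $d$-dimensional neighbor vector, the second coordinate of `x0` has no influence here.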

Remark 4.2.

(1) The estimation of $D$ in step (3) is a classical problem in dimension reduction.
Several approaches exist in the literature. One can, for example, use the eigenvalue
representation of the matrix $\Sigma_n^{-1}\Sigma_{e,n}$, the measure of distance between
spaces as in Li [24], or the selection rule of Ferré [16].

(2) Consistency of the predictor can be obtained by combining the results of Section 3
with the results of Biau and Cadre [5].

5. Conclusion 

In this work, we have proposed two dimension reduction methods for spatial modeling.
The first one is a dimension reduction method for spatial regression. It is a natural
extension of the idea of Li [24] (called the Inverse Regression method) to spatially
dependent variables under a strong mixing condition. On the one hand, it is a good
alternative to the spatial linear regression model since the link between the variables
$X$ and $Y$ is not necessarily linear. Furthermore, as Li [24] points out, any linear model
can be seen as a particular case of model (1.1) with $g$ being the identity function and
$D = 1$. On the other hand, as in the i.i.d. case, it requires less data for computation
than spatial nonparametric regression methods.

The second method that we have studied here deals with spatial prediction modelling.
It is more general than the kriging method, where the Gaussian assumption on $X$ is
needed. Here, we require that $X$ belongs to a larger class of random variables (those
that obey Li's [24] condition recalled in the introduction). Furthermore, our spatial
prediction method inherits the ease of implementation of inverse regression methods. For
example, it makes it possible to estimate the number of neighbors needed for prediction,
which the nonparametric prediction method of Biau and Cadre [5] cannot do.

We have presented here the theoretical framework of our techniques. The next step is
to apply them to real data. This is the subject of work in progress.

6. Proofs and Technical Results 

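6.1. Deviation Bounds. To show the strong consistency results, we will use the
following Bernstein type deviation inequality:

Lemma 6.1. Let $(\zeta_{\mathbf{v}}, \mathbf{v} \in \mathbb{N}^N)$ be a zero-mean real-valued
random spatial process such that for each $\mathbf{v} \in (\mathbb{N}^*)^N$ there exists
$c > 0$ verifying

(6.1) $E|\zeta_{\mathbf{v}}|^k \leq k!\, c^{k-2} E|\zeta_{\mathbf{v}}|^2, \quad \forall k \geq 2$

for some constant $c > 0$. Let $S_n = \sum_{\mathbf{v} \in I_n} \zeta_{\mathbf{v}}$. Then for
each $r \in [1, +\infty]$ and each $\mathbf{n} \in (\mathbb{N}^*)^N$ and
$\mathbf{q} \in (\mathbb{N}^*)^N$ such that $1 \leq q_i \leq \frac{n_i}{2}$, and each
$\varepsilon > 0$,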

(6.2) 

r 

- 4(^/^2.,,) )+2"xqxll ( 1 + iiii-!H^j a(|p|)-/<-«) 

where $M_2^2 = \sup_{\mathbf{v} \in I_n} E\zeta_{\mathbf{v}}^2$.
Remark 6.2. Actually, this result is an extension of Lemma 3.2 of Dabo-Niang and Yao [12],
which deals with bounded processes. This extension is necessary since, in the problem of
interest here, assuming the boundedness of the processes amounts to assuming that the
$X_{\mathbf{i}}$'s are bounded. This is a restrictive condition which is (generally)
incompatible with the cornerstone condition of the inverse regression (if $X$ is
elliptically distributed, for example).
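To see that a moment condition of the form $E|\zeta|^k \leq k!\, c^{k-2} E|\zeta|^2$ is compatible with unbounded variables, one can check it numerically for a centred Gaussian variable, whose absolute moments have the closed form $E|Z|^k = \sigma^k 2^{k/2} \Gamma((k+1)/2) / \sqrt{\pi}$. The sketch below is only an illustrative check for a range of $k$, with $c = \sigma$ chosen by us; it is not part of the paper's proof.

```python
from math import factorial, gamma, pi, sqrt

def gaussian_abs_moment(k, sigma=1.0):
    """E|Z|^k for Z ~ N(0, sigma^2), from the closed-form Gamma expression."""
    return sigma ** k * 2 ** (k / 2) * gamma((k + 1) / 2) / sqrt(pi)

def satisfies_condition(k, c, sigma=1.0):
    """Check E|Z|^k <= k! * c^(k-2) * E|Z|^2 for the given k."""
    bound = factorial(k) * c ** (k - 2) * gaussian_abs_moment(2, sigma)
    return gaussian_abs_moment(k, sigma) <= bound

print(all(satisfies_condition(k, c=1.0) for k in range(2, 30)))
```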

We will use the following lemma to get the weak consistency and a law of the iterated
logarithm, both for the matrix $\Sigma$ (as we will see immediately) and for the matrix
$\Sigma_e$ (see the proofs of the results of Section 3).

Lemma 6.3. Let {X^, n G N^} be a zero-mean stationary spatial process sequence, of 
strong mixing random variables. 

(1) If E||X|p+'^ < +00 and ^a(n)2T« < oo, for some 5 > 0. Then, 



with = ^i(,^N cov(Xfc,Xi) 
(3) If Eexp7||X|| < oo for some constant 7 > 0, if for all m > , a{u) < ap 
< p < 1 or a{u) = C.u-\ 9 > N then. 



• The first result is obtained by using a covariance inequality for strong mixing
processes (see Bosq [7]). Actually, it suffices to enumerate the $X_{\mathbf{i}}$'s in an
arbitrary order and follow the proof of Theorem 1.5 of Bosq [7].

• The law of the iterated logarithm holds by applying the previous Lemma 6.1.



6.2. Consistency of the inverse regression. In Section 3 we have seen that the results
are based on consistency results for the function $r(.)$, which are presented now under
some regularity conditions on the functions $K(.)$, $f(.)$ and $r(.)$.

• The kernel function $K(.) : \mathbb{R} \to \mathbb{R}^+$ is a $k$-order kernel with compact
support satisfying a Lipschitz condition $|K(x) - K(y)| \leq C|x - y|$




(2) 



If E||X|p+'^ < +00 and X]"(n)^+'^ < 00, for some 6 > 0. Then 





Remark 6.4. 
• $f(.)$ and $r(.)$ are functions of $C^k(\mathbb{R})$ ($k \geq 2$) such that
$\sup_y |f^{(k)}(y)| < C_1$ and $\sup_y \|\varphi^{(k)}(y)\| < C_2$ for some constants $C_1$
and $C_2$,
we have the following convergence result:

Lemma 6.5. Suppose a{t) < Ct-\ t > 0, 6 > 2N and C > 0. If nhl{\ogn)-^ 0, 
n<i(logn)"i 00 with 9i = then 

(1) (see [10])

(6.3) $\sup_{y \in \mathbb{R}} |f_n(y) - f(y)| = O_p(\Psi_n)$.

(2) Furthermore, if $E(\|X\|) < \infty$ and $\psi(.) = E(\|X\|^2 | Y = .)$ is continuous, then

(6.4) $\sup_{y \in \mathbb{R}} \|\varphi_n(y) - \varphi(y)\| = O_p(\Psi_n)$.



Remark 6.6. Actually, only the result (6.3) is shown in Carbon et al. [10], but the result
(6.4) is easily obtained by noting that for all $\varepsilon > 0$,

El \X\ I 

P(suPygK||<^n(2/)-E(^n(y)|| > s) < +P (sup^gR I l<^n (y) -E(^n(y) 1 1 > Vz, \\Xi\\ < On) 

with an = 1] (logfi)^/'^, r] > 0. 

Lemma 6.7. If {Z^) is GSM, n/i3(logn)-i and hhn{\ogh)-^^-^ 00, then 

(6.5) $\sup_y |f_n(y) - f(y)| = O_{a.s.}(\Psi_n)$.

Furthermore, if $E(\exp \gamma\|X\|) < \infty$ for some $\gamma > 0$ and
$\psi(.) = E(\|X\|^2 | Y = .)$ is continuous, then

(6.6) $\sup_y \|\varphi_n(y) - \varphi(y)\| = O_{a.s.}(\Psi_n)$.

Remark. The equality (6.5) is due to Carbon et al. [10]. The proof of the equality (6.6) is
obtained by applying Lemma 6.1 and following the proofs of Theorems 3.1 and 3.3 of Carbon
et al. [10]. It is therefore omitted.

We will need the following lemma and the spatial block decomposition:

Lemma 6.8. (Bradley's Lemma in Bosq [7])

Let $(X, Y)$ be an $\mathbb{R}^d \times \mathbb{R}$-valued random vector such that
$Y \in L^r(P)$ for some $r \in [1, +\infty]$. Let $c$ be a real number such that
$\|Y + c\|_r > 0$ and $\xi \in (0, \|Y + c\|_r]$. Then there exists a random variable $Y^*$
such that:

(1) $P_{Y^*} = P_Y$ and $Y^*$ is independent of $X$,

(2) $P(|Y^* - Y| > \xi) \leq 11\, (\xi^{-1} \|Y + c\|_r)^{r/(2r+1)} \times [\alpha(\sigma(X), \sigma(Y))]^{2r/(2r+1)}$.
Let $Y_{\mathbf{u}} = \zeta_{\mathbf{v}}$ with $\mathbf{v} = ([u_i] + 1, 1 \leq i \leq N)$,
for $\mathbf{u} \in \mathbb{R}^N$. The following spatial blocking idea is that of
Tran [31] and Politis and Romano [26].
Let A; = r;.' n ••• f/'^ 1^ Yudu . Then, 

Sn= / ... / Fut^u= J2 

1 < 4 < '^fc 

k = 1,...N 

So, $S_n$ is the sum of $2^N q_1 \times q_2 \times \cdots \times q_N$ terms
$\Delta_{\mathbf{k}}$, each of which is an integral of $Y_{\mathbf{u}}$ over a cubic block of
side $p$. Let us consider the classical block decomposition:

(2j,+l)p 

[/(l,n,j)= ^k, 

ki=2jiP+l, l<i<N 
(2ii+l)p 2(ijv+l)p 

?7(2,n,j)= J] Yl 

fci=2jr-ip+l, l<i<Af-l fcjv = (2ijv + l)p+l 
(2j,+l)p 2(ijv-i+l)p (2iiv+l)p 

t/(3,n,x,j)= J] Yl 

ki=2j,p+l, l<i<N-2 fcjv-i = (2ijv-i+l)p+l fcjv=2jjvp+l 
(2ii+l)p 2(jjv_i+l)p 2(ijv+l)p 

f/(4,n,j)= Y E E 

A:i=2j,p+1, l<i<Ar-2 fcjv_i = (2ijv-i+l)p+l fciv = (2jjv+l)p+l 

and so on. Note that 

2{i»+l)p {2i]v+l)p 

f/(2^-\n,j)= Y E ^-^^ 

fc,={2ji+l)p+l, l<i<Ar-l fcjv=2ijvp+l 

Finally, 

2{i>+i)p 

f/(2^n,j)= 5^ Ak. 

fcj=(2j,;+l)p+l, l<j<Ar 

So, 

(6.7) 5„ = J]r(n,z), 

i=l 

with T(n, i) = '£l=l,i=i,...,NU{i,n,i). 

If $n_i \ne 2 p t_i$, $i = 1, \dots, N$, for every set of integers $t_1, \dots, t_N$, then a term, say $T(n, 2^N + 1)$, containing all the $\Delta_k$'s at the end that are not included in the blocks above, can be added (see Tran [31] or Biau and Cadre [4]). This extra term does not change the result of the proof.
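
The blocking scheme above can be checked numerically on a small lattice. The following Python sketch is only an illustration under simplifying assumptions that are not in the text: it works on a discrete grid whose sides are exact multiples of $2p$ (so that no remainder term is needed), it uses an arbitrary random array `delta` in place of the terms summed in the blocks, and the function name `block_families` is hypothetical. It groups the cubic blocks of side $p$ into the $2^N$ families and confirms that the family totals add up to the full sum, as in (6.7).

```python
import itertools
import numpy as np


def block_families(delta, p):
    """Return {parity: T}, where T is the total of one family of side-p blocks.

    delta : N-dimensional array whose k-th side equals 2 * p * q_k.
    parity: tuple in {0, 1}^N; along each axis, 0 selects the blocks
            starting at 2*j*p and 1 the blocks starting at (2*j + 1)*p
            (0-based indices).
    """
    q = []
    for side in delta.shape:
        if side % (2 * p) != 0:
            raise ValueError("every side must be a multiple of 2 * p")
        q.append(side // (2 * p))

    totals = {}
    for parity in itertools.product((0, 1), repeat=delta.ndim):
        total = 0.0
        # j runs over the q_1 x ... x q_N blocks of this family.
        for j in itertools.product(*(range(q_k) for q_k in q)):
            block = tuple(
                slice((2 * j_k + s_k) * p, (2 * j_k + s_k + 1) * p)
                for j_k, s_k in zip(j, parity)
            )
            total += delta[block].sum()
        totals[parity] = total
    return totals


rng = np.random.default_rng(0)
delta = rng.normal(size=(12, 8))      # N = 2, p = 2, q = (3, 2)
totals = block_families(delta, p=2)
grand_total = sum(totals.values())    # should agree with delta.sum()
```

Within one family, two consecutive blocks along an axis are separated by a gap of $p$ sites, which is the separation that the mixing argument in the proof relies on.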



Proof of Lemma 6.1.



Using (6.7), it suffices to show that



(6.8) 



ne 



P |T(n,0|>-^ <2exp 



4i;2(q) 



q +qxll 1 + 



2r/(2r+l) 



for each $1 \le i \le 2^N$.

Without loss of generality we show (6.8) for $i = 1$. We enumerate (as is often done in this case) in an arbitrary way the $q = q_1 \times q_2 \times \cdots \times q_N$ terms $U(1, n, j)$ of the sum $T(n, 1)$, which we call $W_1, \dots, W_q$. Note that the $U(1, n, j)$ are measurable with respect to the $\sigma$-field generated by $Y_u$ with $u$ such that $2 j_i p \le u_i \le (2 j_i + 1) p$, $i = 1, \dots, N$.

These sets of sites are separated by a distance of at least $p$, and for all $m = 1, \dots, q$ there exists a $j(m)$ such that $W_m = U(1, n, j(m))$, which has the same distribution as $W_m^*$.



Noting that 

|.(2ifc(m)+l)p 



^w;j = E 



(2ii(m)+l)p 



2ji {m)p 



(2jiv(m)+l)p 



Y,,du 



r e [1, +oo] 



du 



[2jfc(m)p]+l 
2jfc (m)p 



[(2ifc(m)+l)p] /.2ifc(m)+l)p 



[{2jk{m)+l)p] 

= ([2jfc(m)p] + 1 - 2jk{m)p) C(v,^,=[2ife(m)p]+i) + ^ 

v^, = [2j^:{m)p]+2 

+ ((2jfc(m) + l)p - [(2jfc(m) + l)p]) C(v,^,=[{2j,(m)+i)p]+i) 

[{2jfc(m)+l)p]+l 

w(j,v)fcCv 

«fc = [2jfe(m)p]+l 

and $|w(j, v)_k| \le 1$ for all $k = 1, \dots, N$, by Minkowski's inequality and (6.1) one gets



(6.9) 



E 



p 



N 



< c'-V!M|,Vr > 2. 
Then, applying recursively the version of Bradley's lemma given in Lemma 6.8, we define
independent random variables $W_1^*, \dots, W_q^*$ such that for all $r \in [1, +\infty]$ and for all $m =$
1, q, has the same distribution with Wm and setting = ]9^^c''~^M|, we have: 

r/(2r+l) 



p(|i^^-i^;i>0<ii 



\W.m + UJrWr 



a{[p]) 



2r/(2r+l) 



where, c = 6uJrP and ^ = min ( jntt^j ~ l)uJrP ) = min ( {6 — l)uJrP ) for some 



6 > 1 specified below. Note that for each m. 



\Wm + C\\r > C 



\W„ 



>{6- l)uJrp'^ > 



so that < ,^ < \ \Wm + c\\r as required in Lemma [621 



Then, if 5 = 1 + ^, 



pi\Wm-w:,\>o<ii{i + 



AuJr 



r/(2r+l) 



ai[p]) 



2r/(2r+l) 



and 



\m=l 



r/(2r+l) 



a{\p\) 



2r/(2r+l) 



Now, note that inequality (6.9) also leads (by Bernstein's inequality) to:



P 



/ J ' ' n 



m=l 



ne , 

>^) <2exp 



4 VjW^ 4- £BPf^r 



Thus 



P(|r(n,l)|>||) < 2exp(-^pg|^j+qxll 1 + 



4cp"M, 



r/(2r+l) 



am 



2r/(2r+l) 



Then, since $q = q_1 \times \cdots \times q_N$ and $n = 2^N p^N q$, we get inequality (6.8). The proof is completed by noting that $P(|S_n| > n\varepsilon) \le 2^N P(|T(n, i)| > n\varepsilon / 2^N)$.
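
The final step is a union bound, written out here for convenience. This is a sketch of the standard argument and assumes only the decomposition (6.7); the maximum over $i$ is introduced here so that the right-hand side does not depend on a particular family.

```latex
\{ |S_n| > n\varepsilon \}
  \subset \bigcup_{i=1}^{2^N} \Big\{ |T(n,i)| > \frac{n\varepsilon}{2^N} \Big\},
\qquad\text{hence}\qquad
P(|S_n| > n\varepsilon)
  \le \sum_{i=1}^{2^N} P\Big( |T(n,i)| > \frac{n\varepsilon}{2^N} \Big)
  \le 2^N \max_{1 \le i \le 2^N} P\Big( |T(n,i)| > \frac{n\varepsilon}{2^N} \Big).
```

The inclusion holds because if every $|T(n,i)|$ is at most $n\varepsilon / 2^N$, then the triangle inequality gives $|S_n| \le n\varepsilon$.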



□ 



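6.3. Proof of Theorem 3.1. We will prove the desired result on $\widehat{\Sigma}_{e,n} - \Sigma_e$ using an intermediate matrix

$$\overline{\Sigma}_{e,n} = \frac{1}{n} \sum_{i \in \mathcal{I}_n} r(Y_i) \, r(Y_i)^T.$$

Start with the following decomposition:

$$\widehat{\Sigma}_{e,n} - \Sigma_e = \left( \widehat{\Sigma}_{e,n} - \overline{\Sigma}_{e,n} \right) + \left( \overline{\Sigma}_{e,n} - \Sigma_e \right).$$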



We first show that: 
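To this aim, we set:

(6.11) $\widehat{\Sigma}_{e,n} - \overline{\Sigma}_{e,n} = S_{n,1} + S_{n,2} + S_{n,3}$

with

$$S_{n,1} = \frac{1}{n} \sum_{i \in \mathcal{I}_n} \left( \hat r_{e_n}(Y_i) - r(Y_i) \right) \left( \hat r_{e_n}(Y_i) - r(Y_i) \right)^T$$

and

$$S_{n,2} = \frac{1}{n} \sum_{i \in \mathcal{I}_n} r(Y_i) \left( \hat r_{e_n}(Y_i) - r(Y_i) \right)^T.$$

Note that $S_{n,3} = S_{n,2}^T$, hence we only need to control the rate of convergence of the first two terms $S_{n,1}$ and $S_{n,2}$.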



We will successively prove that 



and 

Sr. ,. = a ( ^ + /i' 



n,2 — \ T /(-n 



the latter will immediately imply that



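• Control on $S_{n,1}$.
Since for each $y \in \mathbb{R}$:

(6.12) $\hat r_{e_n}(y) - r(y) = \dfrac{r(y)}{\hat f_{e_n}(y)} \left( f(y) - \hat f_{e_n}(y) \right) + \dfrac{1}{\hat f_{e_n}(y)} \left( \varphi_n(y) - \varphi(y) \right)$

and

(6.13) $f(y) - \hat f_{e_n}(y) = f(y) - f_n(y) + (f_n(y) - e_n) \, 1_{\{f_n(y) < e_n\}}$,

for each $i \in (\mathbb{N}^*)^N$

$$\| \hat r_{e_n}(Y_i) - r(Y_i) \| \le \frac{\| r(Y_i) \|}{e_n} \| f_n - f \|_\infty + 2 \| r(Y_i) \| \, 1_{\{f_n(Y_i) < e_n\}} + \frac{1}{e_n} \| \varphi_n - \varphi \|_\infty$$

and

$$\| \hat r_{e_n}(Y_i) - r(Y_i) \|^2 \le 3 \left[ \frac{\| r(Y_i) \|^2}{e_n^2} \| f_n - f \|_\infty^2 + 4 \| r(Y_i) \|^2 \, 1_{\{f_n(Y_i) < e_n\}} + \frac{1}{e_n^2} \| \varphi_n - \varphi \|_\infty^2 \right].$$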




Using the following inequality (see Ferré and Yao [17] for details):
(6-14) l{/n(yi)<e„} < l{/(yi)<en} + 



I oo 



e 



2 



and by results on Lemmas 16.31 and [675l we have: 



Now, noting that 

|2 

■i)<en}5 



we have (since E ( '/(yfp l{/(yi)<en}) = (^) by assumption): 

(6.15) I Yl ll^(^i)ll' l{/(^0<en} = O, 

and 

because of Assumption E (^■jpp-i{f{Y)<en}^ = O (-i^). 



Now, since = + ^ and ^ ^ y tl^ ^ ^^^^'^ C > an arbitrary 
constante), we have: 

(6.16) = 0J^ 



M e2 r 



• Control on $S_{n,2}$.

Noting that : ^ = i + ^ + 4a^= i + + ^ with ^ = max{/,en}, 

J En J Jen J Jen J en J Jen J Jen Jen 

we have: 



= 1 y: '^i^^ im - fenm + (^n(yi) - ^ir^f 



where 



/?„,(!!) = r(yi) r(ro^(/(ri)-/„(ii)) + (v^n(ii)-v^(ri))^ 



1 1 ^/e„(>^i)-/e.(>i) 



m /e.(rO/e„(li) 
and



"2 



Furthermore:

• since for all y G M we have 



< 4- and by several calculus we also have 



$|f_{e_n}(y) - \hat f_{e_n}(y)| \le |f(y) - f_n(y)|$ and then $\| f_{e_n} - \hat f_{e_n} \|_\infty \le \| f_n - f \|_\infty$, we also
have on the one hand:



(6.17) < (IK^OII ll^n - (/^lloo + \\r{YO\\' ll/n - /lloo) 



1 , , ll/n- 



fiYi) 



on the other hand we have 



< ;^2^lr(^i)ll TwTTx l{/nm)<en} 



{Y,)<en}- 



because for all $y \in \mathbb{R}$, $|f_n(y) - \hat f_{e_n}(y)| = |f_n(y) - e_n| \, 1_{\{f_n(y) < e_n\}} \le 2 e_n 1_{\{f_n(y) < e_n\}}$.
Then, it follows from (6.14) and (6.15) that:



as for $S_{n,1}$, we deduce:



Now, observing that



Rn2 



\r{Yd\f 



n ^ 

ieXn 

we have (as previously): 
(6-18) 

i< 

Moreover, since E 1 
(6-19) 



k(ii)|p 



4/(^i)<en}> 



{/(^)<en} 



4/(^i)<en} 



O , we also have: 



iGXn 



vm\ 



-{/(yi)<en} 



Or. 



n 2 



So, combining (6.17), (6.18) and (6.19), we get:



n 2 
and since -fp- C ^^j (for n large) we have: 



and



Then, 



P' 



with 



and 



To finish, we are going to show that 



Note that: 



n ^-^ n„ 



1=1 



where r(.) is a function defined by T{y) = "^''^Y^^y^^ for y G M and 



is a second-order von Mises functional statistic whose associated U-statistic is

= ^^7^ E + -mKuAy^ - y^- 

Since: K = f/„ + Op(4), 



i=l ^ ' 



We apply Lemma 2.1 with $m = 2$, $h(y_1, y_2) = \frac{1}{2} \left[ \tau(y_1) + \tau(y_2) \right] K_{h_n}(y_1 - y_2)$

and

$\theta(F) = E(h_1(Y)) = E\left( \tau(Y) \, (f * K_{h_n})(Y) \right)$.
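
The passage from a von Mises statistic to its U-statistic rests on an exact algebraic identity, which the short Python sketch below verifies numerically. It is an illustration only: the symmetric kernel `h` (a Gaussian-weighted average of two values), the bandwidth and the simulated sample are assumptions made for this example and are not the quantities used in the proof.

```python
import numpy as np


def v_and_u_statistics(y, h):
    """Second-order V-statistic and U-statistic of a symmetric kernel h."""
    n = len(y)
    H = h(y[:, None], y[None, :])                    # H[i, j] = h(y_i, y_j)
    diag_sum = np.trace(H)
    v_stat = H.sum() / n**2                          # all pairs, including i == j
    u_stat = (H.sum() - diag_sum) / (n * (n - 1))    # pairs with i != j only
    return v_stat, u_stat, diag_sum / n


def h(y1, y2, bandwidth=0.5):
    # Hypothetical symmetric kernel: 0.5 * [t(y1) + t(y2)] * K_b(y1 - y2)
    # with t(y) = y**2 and a Gaussian K; chosen only for illustration.
    k = np.exp(-0.5 * ((y1 - y2) / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return 0.5 * (y1**2 + y2**2) * k


rng = np.random.default_rng(1)
y = rng.normal(size=200)
v_stat, u_stat, diag_mean = v_and_u_statistics(y, h)
# Exact identity: V_n - U_n = (diag_mean - U_n) / n
gap = v_stat - u_stat
```

For a fixed kernel the gap is therefore of order $1/n$; when the kernel involves a bandwidth that shrinks with $n$, the diagonal term grows with the inverse bandwidth and has to be tracked separately.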
Since

\\h{YuY2)\U+5<C.\\r{Y)\U+5<^, 

by assumption (13.ip then, 

f/n = e(F) + I 5^ (/ii(Fi) - 0(F)) + Op(i). 

i 

and 



(1) 

n, 2 



1=1 



hy 



Ylhr 



i=l 



f*Kf,^{Y,)\ , 0(F) - (r./) * ir,JyO 



Since / and r(.) belongs to C^(M), we get 



and 



Then, we have 



ir.f)*K,M 



hr 



f{y) 



-ir.fM 



hr 



/in 
0(F) 



mr.f){Y)) + 0{hl). 



Finally: 



oM + 



By using similar arguments and applying Lemma [2.11 with m = 3, one also gets 

1 . 



So, 



5S = f.('4 + ,, 



, 1 

^n,2 = a ( ^ + + 



Then, equality, (I616D . and fICTD lead to fICTD . 
Recall that '^n = + y • Then, the fact that there exist a real A > such that 
Vn> A, ^ < and : 

' nhn n/i„e^ 

(6.20) Sn,2 = oJ'^ + hi 



e2 



Finally, using equality (6.10), one has:



(6.21) Se,n-Se = S,,„-Se + OpM^ + /i 



2 



To complete the proof, we will use Lemma [Ql To this aim, it suffices to choose 6 = 5 
with 5 > 2N then E||X||'^+'' < oo and X]fcQ^(^)^ < hence we have: 



n 



which ends the proof. □ 
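
To make the objects of this proof concrete, the sketch below computes a truncated kernel estimate of the inverse regression and the resulting matrix on simulated data. It is only an illustration under assumptions that are not in the paper: a Gaussian kernel, independent observations instead of a mixing random field, a toy linear model, and hypothetical names (`truncated_inverse_regression`, `h`, `e`). The truncation replaces the density estimate $f_n$ by $\max(f_n, e_n)$ in the denominator, in line with (6.13).

```python
import numpy as np


def truncated_inverse_regression(X, Y, h, e):
    """Kernel estimate of r(y) = E(X | Y = y) at the sample points,
    with the density estimate truncated from below at e."""
    n = len(Y)
    Xc = X - X.mean(axis=0)                        # centre X (assumption: E X = 0)
    u = (Y[:, None] - Y[None, :]) / h
    K = np.exp(-0.5 * u**2) / (h * np.sqrt(2 * np.pi))   # K_h(Y_i - Y_j)
    f_n = K.mean(axis=1)                           # density estimate f_n(Y_i)
    phi_n = K @ Xc / n                             # estimate of r(Y_i) * f(Y_i)
    f_trunc = np.maximum(f_n, e)                   # max(f_n, e)
    r_hat = phi_n / f_trunc[:, None]               # truncated inverse regression
    sigma_hat = r_hat.T @ r_hat / n                # (1/n) * sum of r_hat r_hat^T
    return f_n, f_trunc, r_hat, sigma_hat


rng = np.random.default_rng(2)
n, d = 300, 3
beta = np.array([1.0, -1.0, 0.5])
Y = rng.normal(size=n)
# Toy model in which E(X | Y) = beta * Y.
X = np.outer(Y, beta) + 0.3 * rng.normal(size=(n, d))
f_n, f_trunc, r_hat, sigma_hat = truncated_inverse_regression(X, Y, h=0.4, e=0.05)
```

In a dimension-reduction setting one would then inspect the leading eigenvectors of `sigma_hat`; in this toy model they are expected to lie close to the direction of `beta`, although that is not checked here.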



6.4. Proof of Corollary 3.2.

The proof is achieved by replacing — and — n^'^^ with ^ + ^<ci<| — 2c2 
on equality (16.2ip □ 



6.5. Proof of Corollary 3.3.

Chosing /in — n^^i and — n"'^^ where ^ + ^<ci<| — 2c2 on equality (16.211) . one 
gets Se,n — = Se n — Se + Op(-^) and the central limit theorem for spatial data and 
Slutsky's theorem complete the proof.



Proof of Theorem 3.4. Let



pact set, J is bounded and replace the assumption 
E (exp (||r(F)|| l{/(y)<e„})) = O (n"«) for some ^ > 0. Then 



( °^ ?^ " ) ^ note that since Y take place on a com- 

by 



r(Y) 

7{yyi{/m<e„} 



P 



4/(^i)<en} 











f n y 





and because of Minskovski's inequality: for alU G N*, E (^(iEieXn lk(^i)ll l{/(>'i)<en}) j < 
||r(F) l{/(y)<e^} , we can say that with using the argument E (exp (||r(F) || l{/(y)<en})) = 
O (n-€) : 



P 



^ — « I / log log 1 r n 

< Ci n ^. exp I —e \ ) | tor someGi > U. 



< Ci exp — ^ log h — £ 



1 ' 

n N 2 



log log n 



< Ci exp ( — min(^,e) ( logn + ^ " 



< Ci exp — min(^,£:) logn 1 + 



Vloglogn 
1 



a/ (logn) log log 

as n — > +CXD, exp I — min(^,e) logn I 1 H — , ^ = ] ] ^ n^'"^ where C2 is positive 

constant. So, i Zliex„ 7^1{/(>'i)<en} = Oa.s (^( '"^^"^ " ) ' j and the proof is complet by 
using Lemma 6.7 and following the proof of Theorem 3.1. □

Proof of Corollary 13. 5L If moreover we chose — n~'^^ and Cn — n~'^^ where ^ + ^ < 
ci < I — 2c2, then, 



log logn y log logn 



- X ^ = " . - + fi- 2+^i+^'=2 log n this latter tend to zero as soon as + < 



loglogn el ^log logn ^ k 4k 

Ci < i - 2C2 

The proof is obtained by following the proof of Corollary 3.2 and using the law of the
iterated logarithm recalled in Lemma 6.3.